Abstract. In this paper we study the Metropolis algorithm in connection
with two mean-field spin systems, the so-called mean-field Ising model and
the Blume-Emery-Griffiths model. In both these examples the naive choice of
proposal chain gives rise, for some parameters, to a slowly mixing Metropolis
chain, that is, a chain whose spectral gap decreases exponentially fast (in
the dimension of the problem). Here we show how a slight variant in the
proposal chain can avoid this problem, keeping the mean computational cost
similar to the cost of the usual Metropolis. More precisely we prove that,
with a suitable variant in the proposal, the Metropolis chain has a spectral
gap which decreases polynomially in $1/N$. Using some symmetry structure of
the energy, the method rests on allowing appropriate jumps within the energy
level of the starting state, and it is strictly connected to both the small world
Markov chains of [15], [16] and to the equi-energy sampling of [22] and [26].



1. Introduction. 

The Metropolis algorithm, introduced in [521 and later generalized in [TB], is
currently (together with other Markov chain Monte Carlo methods) one of the
most used simulation techniques both in statistics and in physics. See, among
others, [33 EH ESma [33 Ell [SHE].

In a finite setting the Metropolis algorithm can be described as follows. Suppose
that, given a probability $\pi(x)$ on a finite set $\mathcal{X}$, we want to approximate

(1.1) $\mu = \sum_{x} f(x)\pi(x)$,

for $f : \mathcal{X} \to \mathbb{R}$. As a first step, take a reversible Markov chain $K(x,y)$ (the proposal
chain) on $\mathcal{X}$ and change its output in order to have a new chain with stationary
distribution $\pi$. This can be achieved by constructing a new ($\pi$-reversible) chain

(1.2) $M(x,y) = \begin{cases} K(x,y)A(x,y) & \text{if } x \neq y \\ K(x,x) + \sum_{z \neq x} K(x,z)(1 - A(x,z)) & \text{if } x = y \end{cases}$

where $A(x,y) := \min\left(\frac{\pi(y)K(y,x)}{\pi(x)K(x,y)}, 1\right)$. Then, the Metropolis estimate of $\mu$ is given by

(1.3) $\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^{n} f(Y_i)$,

where $Y_0$ is generated from some initial distribution $\pi_0$ and $Y_1, \dots, Y_n$ from $M(x,y)$.
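As an illustration of the construction (1.2), the following minimal Python sketch builds the Metropolis matrix from a proposal matrix and a target distribution. The function name `metropolis_kernel` and the three-state toy example are assumptions made for illustration; they do not come from the paper.

```python
def metropolis_kernel(K, pi):
    """Build the Metropolis matrix M of (1.2) from a proposal K and a target pi.

    K is a list of rows (a stochastic matrix), pi a list of positive weights.
    """
    n = len(pi)
    M = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y and K[x][y] > 0.0:
                # acceptance probability A(x, y) = min(pi(y)K(y,x) / (pi(x)K(x,y)), 1)
                accept = min(pi[y] * K[y][x] / (pi[x] * K[x][y]), 1.0)
                M[x][y] = K[x][y] * accept
        # rejected proposals stay at x, so that every row sums to one
        M[x][x] = 1.0 - sum(M[x][y] for y in range(n) if y != x)
    return M


# toy example (hypothetical numbers): nearest-neighbour walk on three states
K = [[0.5, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.5]]
pi = [0.2, 0.3, 0.5]
M = metropolis_kernel(K, pi)
```

The diagonal entry is obtained by complement, which agrees with the second line of (1.2) because the rows of $K$ sum to one.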

It is clear that, from a computational point of view, the speed of convergence to 
the stationary distribution and the (asymptotic) variance of the estimate are two 
very important features of the Markov chain M. 

It is well-known that in some situations a Markov chain can converge very slowly
to its stationary distribution and, moreover, that the asymptotic variance of the
estimate (1.3) can be much bigger than the variance of $f$, i.e. $Var_\pi(f) := \sum_x (f(x) - \mu)^2\pi(x)$,
which is equal to the asymptotic variance of the crude Monte Carlo estimator.
In these cases (1.3) turns out to be a very inefficient estimate of $\mu$.

For the Metropolis chain a classical situation in which the convergence is slow
(and the variance big) is when the target distribution $\pi$ has many peaks and $K$ is
somehow too "local".

This is well known in statistical physics, where, typically, the distribution of a
system with energy function $h$ in thermal equilibrium at temperature $T$ is
described by the Gibbs distribution

$\pi_{h,T}(x) = \exp\{-h(x)/T\} Z_T^{-1}$

with $Z_T = \sum_x \exp\{-h(x)/T\}$. In point of fact, the Metropolis algorithm was
proposed in [55] to compute averages with respect to such distributions. Indeed,
if $h$ is nice, the Metropolis algorithm is very efficient, but it can perform very
poorly if the energy has many local minima separated by high barriers that cannot
be crossed by the proposal moves $K$. This problem can be bypassed, for specific
energies, by designing appropriate moves that have a higher chance of cutting across the
energy barrier (see, e.g., [Hll]), or by constructing clever alternative approaches to the
problem, for instance using a reparametrization of the problem (see, e.g., [12], [13])
or using auxiliary variables (see, e.g., [101 [HI [11130]). A different kind of solution has
been proposed in [TJ and in [55] by introducing the so-called simulated tempering,
which essentially means that $T$ is changed (stochastically or not) to flatten $h$. A
remarkable variant of these methods is parallel tempering, see, for instance, [19].
More recently new algorithms based on the so called equi-energy levels sampling 
have been proposed (see [26] and [22]). In particular, the algorithm proposed in [22] 
relies on the so-called equi-energy jump, which enables the chain to reach regions 
of the sample space with energy close to the one of the starting state, but that may 
be separated by steep energy barriers. In point of fact, even if, according to some
simulations, the method seems to be efficient, nothing has been formally proved.
Finally, let us mention a recent class of algorithms, called small world Markov chains (see
[15], [16]), that combine a local chain with long jumps. In these papers, it has
been shown that a simple modification of the proposal mechanism results in faster
convergence of the chain. That mechanism, which is based on an idea from the
field of small-world networks, amounts to adding occasional wild proposals to any
local proposal scheme.

In the present paper we study two simple examples: the so-called mean field Ising
model and the mean field Blume-Emery-Griffiths model. As for the former, it is
well-known that the usual choice of $K$ gives rise, for low temperature, to a slowly
mixing Metropolis chain (see, e.g., [2S]). Here we show that a slight variant in the
proposal chain can completely solve this problem, keeping the mean computational
cost similar to the cost of the usual Metropolis. The idea again rests on allowing
appropriate jumps in the same energy level of the starting state. As for the
Blume-Emery-Griffiths mean-field model, we first show that there is a critical region of
the parameter space for which the naive Metropolis chain is slowly mixing. Then
we show how one can modify the proposal chain in order to obtain a better mixing
for the Metropolis chain. The present paper should be understood as a further step in
the direction of a better mathematical understanding of both small world Markov
chains and equi-energy sampling.

The rest of the paper is organized as follows. In Section 2 some general
considerations are given. In Section 3 some basic tools concerning Markov chains, which
will be used in the paper, are reviewed. Section 4 contains a warming up example.
In Section 5 the mean field Ising model is treated, while Section 6 deals with the
more complex case of the mean field Blume-Emery-Griffiths model. All the proofs
are deferred to the Appendix.

2. A GENERAL STRATEGY

In an abstract setting, what we shall do in the next examples can be summarized
as follows. Let $\mathcal{G}$ be a group acting on $\mathcal{X}$ for which

(2.1) $\pi(x) = \pi(g(x))$ $\quad \forall x \in \mathcal{X}, \ \forall g \in \mathcal{G}$.

For every $x$ in $\mathcal{X}$ let $O_x := \{y = g(x) : g \in \mathcal{G}\}$ be the orbit of $x$ (of course if $y$
belongs to $O_x$ then $O_x = O_y$).

Assume now that we have a reversible Markov chain $K_E(x,y)$ (the proposal) on
$\mathcal{X}$ and suppose that the Metropolis chain $M_E$ with proposal $K_E$ is slowly mixing
(see next section for more details). To speed up the mixing one can try to exploit
(2.1) by taking a proposal of the following form:

(2.2) $K_\epsilon(x,y) = \epsilon K_E(x,y) + (1-\epsilon)K_{\mathcal{G}}(x,y)$

where

$K_{\mathcal{G}}(x,y) = \sum_{z \in O_x} q_x(z) I_{\{z\}}(y)$,

$0 \le q_x(z) \le 1$ and $\sum_{z \in O_x} q_x(z) = 1$.

In point of fact, usually $K_E$ is "local"; for instance frequently

$K_E(x,y) = 0$

whenever $y \neq x$ belongs to $O_x$, hence with $K_{\mathcal{G}}$ we are adding "long" jumps to the
chain. Moreover, note that if $K_\epsilon$ is such that $K_\epsilon(x,g(x)) = K_\epsilon(g(x),x)$ for every
$x$ in $\mathcal{X}$ and $g$ in $\mathcal{G}$, then the Metropolis chain always accepts the move $x \to g(x)$ and

$M(x,g(x)) = \epsilon K_E(x,g(x)) + (1-\epsilon)q_x(g(x))$.

In particular this holds when $K_\epsilon$ is symmetric.

The heuristic behind (2.2) is to combine small world Markov chains and
equi-energy sampling.

Before presenting some examples in which one can actually improve the perfor- 
mances of the Metropolis chain using this idea, we collect in the next section some 
useful facts concerning Markov chains. 

3. Preliminaries 

Let $P(x,y)$ be a reversible and ergodic Markov chain on the finite set $\mathcal{X}$ with
(unique) stationary distribution $p(x)$. Thus, $p(x)P(x,y) = p(y)P(y,x)$. Let $L^2(p) =
\{f : \mathcal{X} \to \mathbb{R}\}$ with $\langle f,g\rangle_p = E_p(fg) = \sum_x f(x)g(x)p(x)$. Reversibility is
equivalent to $P : L^2(p) \to L^2(p)$ being self-adjoint. Here $Pf(x) = \sum_y f(y)P(x,y)$. The
spectral theorem implies that $P$ has real eigenvalues $1 = \lambda_0(P) > \lambda_1(P) \ge \lambda_2(P) \ge
\dots \ge \lambda_{|\mathcal{X}|-1}(P) > -1$ with orthonormal basis of eigenfunctions $\psi_i : \mathcal{X} \to \mathbb{R}$
($P\psi_i(x) = \lambda_i\psi_i(x)$, $\langle \psi_i,\psi_j\rangle_p = \delta_{ij}$).

3.1. Spectral gap, variance and speed of convergence. A very important
quantity related to the eigenvalues is the spectral gap, defined by

$Gap(P) = 1 - \max\{\lambda_1, |\lambda_{|\mathcal{X}|-1}|\}$.

It turns out that the spectral gap is a good index to measure the mixing of a
chain. To better understand this point, assume that $f$ belongs to $L^2(p)$ and write
$f(x) = \sum_{i \ge 0} a_i\psi_i(x)$ (with $a_i = \langle f,\psi_i\rangle_p$). Now let $Y_0$ be chosen from some
distribution $p_0$ and $Y_1, \dots, Y_n$ be a realization of the $P(x,y)$ chain, then

$\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^{n} f(Y_i)$

has asymptotic variance given by

$AVar(f,p,P) := \lim_{n \to \infty} n \cdot Var(\hat{\mu}_n) = \sum_{k \ge 1} |a_k|^2\,\frac{1+\lambda_k}{1-\lambda_k}$.

See, for instance, Theorem 6.5 in Chapter 6 of [3]. From the last expression, the
classical inequality

(3.1) $AVar(f,p,P) \le \frac{2}{Gap(P)}\,Var_p(f)$

follows easily. The last inequality is the usual way of relating spectral gap to
asymptotic variance and, hence, to the efficiency of a chain.

The spectral gap is also very important to give bounds on the speed of
convergence to the stationary distribution. For example, if $\|\cdot\|_{TV}$ denotes the total
variation norm, one has

$\|\delta_x P^k - p\|_{TV}^2 := \left(\sup_{A \subset \mathcal{X}} |P^k(x,A) - p(A)|\right)^2 \le \frac{1-p(x)}{4p(x)}\left(\max\{\lambda_1, |\lambda_{|\mathcal{X}|-1}|\}\right)^{2k}$.

See, e.g., Proposition 3 in [7]. Another classical bound is

$\|p_0P^k/p - 1\|_{2,p} \le (1 - Gap(P))^k\,\|p_0/p - 1\|_{2,p}$

valid for every probability $p_0$. See, for instance, [39].

Roughly speaking one can say that a sequence of Markov chains defined on a
sequence of state spaces $\mathcal{X}_N$ is slowly mixing (in the dimension of the problem $N$)
if the spectral gap decreases exponentially fast in $N$.

3.2. Cheeger's inequality. As already recalled, slow mixing typically
occurs when $\pi$ has two or more peaks and the chain $K$ can only move in
a neighborhood of the starting peak. Usually this phenomenon is called a
bottleneck. A powerful tool to detect the presence of a bottleneck is the conductance and
the related Cheeger's inequality. Recall that the conductance of a chain $P$ with
stationary distribution $p$ is defined by

$h = h(p,P) := \inf_{A : p(A) \le 1/2} \frac{1}{p(A)} \sum_{x \in A}\sum_{y \in A^c} p(x)P(x,y)$,

and the well-known Cheeger's inequality is

(3.2) $1 - 2h \le \lambda_1(P) \le 1 - \frac{h^2}{2}$.

See, for instance, [31I3Z1IZ]. Note that, since $P$ is reversible,

(3.3) $h \le \frac{1}{p(A)}\sum_{x \in A}\sum_{y \in A^c} p(x)P(x,y) = \frac{1}{p(A)}\sum_{x \in A}\sum_{y \in A^c} p(y)P(y,x)$

for every $A$ such that $p(A) \le 1/2$.
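For a very small state space the conductance can be computed by brute force, which gives a quick way to check (3.2) numerically. The sketch below is only an illustration: the function name `conductance` and the two-state chain with its numbers are assumptions, and enumerating all subsets is feasible only for a handful of states.

```python
from itertools import combinations


def conductance(p, P):
    """Brute-force h = min over A with p(A) <= 1/2 of (flow out of A) / p(A)."""
    n = len(p)
    best = float("inf")
    for size in range(1, n):
        for A in combinations(range(n), size):
            pA = sum(p[x] for x in A)
            if pA > 0.5 + 1e-12:
                continue  # only sets with p(A) <= 1/2 enter the infimum
            flow = sum(p[x] * P[x][y] for x in A for y in range(n) if y not in A)
            best = min(best, flow / pA)
    return best


# two-state chain (hypothetical numbers); its second eigenvalue is 1 - a - b
a, b = 0.2, 0.1
P = [[1 - a, a], [b, 1 - b]]
p = [b / (a + b), a / (a + b)]
h = conductance(p, P)
lam1 = 1 - a - b
```

In this example only the set $A = \{0\}$ has mass at most $1/2$, so $h$ equals the exit probability $a$, and both sides of (3.2) can be compared with `lam1` directly.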

3.3. Chain decomposition theorem. In this subsection we briefly describe a
useful technique to obtain bounds on the spectral gap: the so-called chain
decomposition technique. Following [16] assume that $A_1, \dots, A_m$ is a partition of $\mathcal{X}$.
Moreover, for each $i = 1, \dots, m$, define a new Markov chain on $A_i$ by setting

$P_{A_i}(x,y) := P(x,y) + I_{\{x\}}(y)\sum_{z \in A_i^c} P(x,z)$ $\quad (x,y \in A_i)$.

$P_{A_i}$ is a reversible chain on the state space $A_i$ with respect to the probability
measure

$p_i(x) := p(x)/p(A_i)$.

The movement of the original chain among the "pieces" $A_1, \dots, A_m$ can be
described by a Markov chain with state space $\{1, \dots, m\}$ and transition probabilities

$P_H(i,j) := \frac{1}{2p(A_i)}\sum_{x \in A_i}\sum_{y \in A_j} p(x)P(x,y)$

for $i \neq j$ and

$P_H(i,i) := 1 - \sum_{j \neq i} P_H(i,j)$,

which is reversible with stationary distribution

$\bar{p}(i) := p(A_i)$.

A variant of a result of Caracciolo, Pelissetto and Sokal (published in [?]) states
that

(3.4) $Gap(P) \ge \frac{1}{2}\,Gap(P_H)\left(\min_{i=1,\dots,m} Gap(P_{A_i})\right)$

holds true, see Theorem 2.2 in [16]. Other results about chain decompositions can
be found, for instance, in [20].
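The two families of chains entering (3.4) can be computed mechanically from $P$, $p$ and the partition. The following sketch assumes the definitions of $P_H$ (with the factor $1/2$) and $P_{A_i}$ given in this subsection; the function name `decompose` and the lazy walk on a 4-cycle are hypothetical choices made for illustration.

```python
def decompose(P, p, blocks):
    """Return the projection chain P_H and the restricted chains P_{A_i}.

    P: reversible transition matrix, p: its stationary distribution,
    blocks: list of lists of states forming a partition.
    """
    m = len(blocks)
    mass = [sum(p[x] for x in A) for A in blocks]
    P_H = [[0.0] * m for _ in range(m)]
    for i, A in enumerate(blocks):
        for j, B in enumerate(blocks):
            if i != j:
                flow = sum(p[x] * P[x][y] for x in A for y in B)
                P_H[i][j] = flow / (2.0 * mass[i])
        P_H[i][i] = 1.0 - sum(P_H[i])  # diagonal entry is still 0 at this point
    restricted = []
    for A in blocks:
        R = [[P[x][y] for y in A] for x in A]
        for a, x in enumerate(A):
            # the mass that would leave A_i is put back on the diagonal
            R[a][a] += sum(P[x][z] for z in range(len(p)) if z not in A)
        restricted.append(R)
    return P_H, restricted


# lazy simple random walk on a 4-cycle, split into two arcs
n = 4
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][x] = 0.5
    P[x][(x + 1) % n] = 0.25
    P[x][(x - 1) % n] = 0.25
p = [0.25] * n
P_H, restricted = decompose(P, p, [[0, 1], [2, 3]])
```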

In the next very simple example we shall show how this technique can be used, 
starting from a slowly mixing chain, to suggest how to modify the proposal chain 
in order to obtain a fast mixing chain. 

4. Warming up example 
Set $\mathcal{X} = \{-N, -N+1, \dots, 0, 1, \dots, N\}$ and define a probability measure on $\mathcal{X}$
by

$\pi(x) = \frac{(\theta-1)\theta^{|x|}}{2\theta^{N+1} - 1 - \theta}$,

$\theta$ being a given parameter bigger than 1. Here we can consider $\mathcal{G} = \{+1, -1\}$ (with
group operation given by the usual product) acting on $\mathcal{X}$ by $g(x) = gx$, hence
$O_x = \{x, -x\}$.

Now let $K_E$ be a chain defined by

$K_E(x,x+1) = 1/2$, $\quad x \neq N$

$K_E(x,x-1) = 1/2$, $\quad x \neq -N$

$K_E(N,N) = K_E(-N,-N) = 1/2$

$K_E(x,y) = 0$ $\quad$ otherwise

and denote by $M_E$ the Metropolis chain with stationary distribution $\pi$ derived
from $K_E$. It is clear that in this case $K_E(x,y) = 0$ whenever $y \neq x$ belongs to $O_x$.
In this example it is very easy to bound the conductance of $M_E$; indeed, taking
$A = \{-N, \dots, -1\}$, by (3.3) it follows that

$h(\pi,M_E) \le \frac{\pi(-1)M_E(-1,0)}{\pi(A)} = \frac{\pi(0)}{1 - \pi(0)}$.

Hence,

$h(\pi,M_E) \le c\,\theta^{-N}$,

and then (3.2) yields

$1 - \lambda_1 \le 2c\,\theta^{-N}$.

This means that, if $f$ is such that $a_1 \neq 0$ and $\theta > 1$, then the asymptotic variance
of $f$ blows up exponentially fast, indeed

$AVar(f,\pi,M_E) \ge 2C e^{\log(\theta)N}$.
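The conductance bound for the set $A = \{-N, \dots, -1\}$ can be checked numerically. The sketch below assumes the measure $\pi(x) \propto \theta^{|x|}$ and the nearest-neighbour proposal of this section; the function names are hypothetical. It evaluates the ratio in (3.3) and makes it possible to compare it with $\pi(0)/(1-\pi(0))$.

```python
def warmup_pi(N, theta):
    """The measure pi(x) = (theta - 1) theta^|x| / (2 theta^(N+1) - 1 - theta)."""
    Z = 2 * theta ** (N + 1) - 1 - theta
    return {x: (theta - 1) * theta ** abs(x) / Z for x in range(-N, N + 1)}


def flow_ratio_left_half(N, theta):
    """Right-hand side of (3.3) for A = {-N, ..., -1} and the chain M_E."""
    pi = warmup_pi(N, theta)
    # the only edge leaving A is -1 -> 0, with M_E(-1, 0) = (1/2) min(pi(0)/pi(-1), 1)
    m = 0.5 * min(pi[0] / pi[-1], 1.0)
    pA = sum(pi[x] for x in range(-N, 0))
    return pi[-1] * m / pA, pi


ratio, pi = flow_ratio_left_half(10, 2.0)
```

For $\theta = 2$ and $N = 10$ the ratio is already of order $10^{-4}$, in line with the exponential decay in $N$.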
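Now, instead of $K_E$ consider

$K_\epsilon(x,y) = (1-\epsilon)K_E(x,y) + \epsilon I_{\{-x\}}(y)$

and let $M^{(\epsilon)}$ be the Metropolis chain derived from $K_\epsilon$. Decompose $\mathcal{X}$ as follows

$\mathcal{X} = A_1 \cup A_2 \cup \dots \cup A_N$

with $A_1 = \{-1, 0, 1\}$ and $A_i = \{x \in \mathcal{X} : |x| = i\}$, for $i > 1$. Moreover let

$\bar{\pi}(i) = \pi(A_i) = \begin{cases} (\theta-1)(2\theta+1)/Z & \text{for } i = 1 \\ 2(\theta-1)\theta^{i}/Z & \text{for } i > 1 \end{cases}$

where

$Z = 2\theta^{N+1} - 1 - \theta$,

and set

$M_H^{(\epsilon)}(i,j) := \frac{1}{2\pi(A_i)}\sum_{x \in A_i}\sum_{y \in A_j} M^{(\epsilon)}(x,y)\pi(x)$ $\ (i \neq j)$, $\qquad M_H^{(\epsilon)}(i,i) := 1 - \sum_{j \neq i} M_H^{(\epsilon)}(i,j)$.

For $i \neq 1, N$, one has

$M_H^{(\epsilon)}(i,i+1) = \frac{1}{2\pi(A_i)}\left[M^{(\epsilon)}(i,i+1)\pi(i) + M^{(\epsilon)}(-i,-i-1)\pi(-i)\right]$

and, since $\pi(i) = \pi(-i)$ and $\pi(i+1) > \pi(i)$,

$M_H^{(\epsilon)}(i,i+1) = \frac{1-\epsilon}{4}$.

In the same way it is easy to see that

$M_H^{(\epsilon)}(i,i-1) = \frac{1-\epsilon}{4\theta}$, $\qquad M_H^{(\epsilon)}(i,i) = 1 - \frac{1-\epsilon}{4}(1 + \theta^{-1})$ $\qquad i \neq 1, N$

$M_H^{(\epsilon)}(N,N-1) = \frac{1-\epsilon}{4\theta}$, $\qquad M_H^{(\epsilon)}(N,N) = 1 - \frac{1-\epsilon}{4\theta}$

$M_H^{(\epsilon)}(1,2) = \frac{1-\epsilon}{4(1 + 1/(2\theta))} = 1 - M_H^{(\epsilon)}(1,1)$.

Moreover, for every $i \neq 1$, $M_{A_i}^{(\epsilon)}$ in matrix form is given by

$\begin{pmatrix} 1-\epsilon & \epsilon \\ \epsilon & 1-\epsilon \end{pmatrix}$

and hence

$Gap(M_{A_i}^{(\epsilon)}) = 1 - |1 - 2\epsilon|$.

While $M_{A_1}^{(\epsilon)}$ is given by

$\begin{pmatrix} (2\theta-1)(1-\epsilon)/(2\theta) & (1-\epsilon)/(2\theta) & \epsilon \\ (1-\epsilon)/2 & \epsilon & (1-\epsilon)/2 \\ \epsilon & (1-\epsilon)/(2\theta) & (2\theta-1)(1-\epsilon)/(2\theta) \end{pmatrix}$

and hence

$Gap(M_{A_1}^{(\epsilon)}) = k(\theta,\epsilon) > 0$.

Moreover, since

$M_H^{(\epsilon)}(i,i \pm 1) \ge \min\left\{\min_{i \neq 1,N} M_H^{(\epsilon)}(i,i \pm 1),\ M_H^{(\epsilon)}(1,2),\ M_H^{(\epsilon)}(N,N-1)\right\} \ge \min\left\{\frac{1-\epsilon}{4\theta},\ \frac{1-\epsilon}{4(1 + 1/(2\theta))}\right\} =: m(\epsilon,\theta) > 0$

and $\bar{\pi}(i) \le 3\bar{\pi}(j)$ for every $i < j$, Lemma A.1 in the Appendix yields that

$\lambda_1(M_H^{(\epsilon)}) \le 1 - \frac{m(\epsilon,\theta)}{3N^2}$.

In the same way, since $M_H^{(\epsilon)}(i,i+1) + M_H^{(\epsilon)}(i,i-1) \le (1-\epsilon)M(\theta)/4$, with $M(\theta) =
\max(1 + \theta^{-1}, 2\theta/(2\theta+1)) < 2$, inequality (A.1) in the Appendix yields that

$\lambda_{N-1}(M_H^{(\epsilon)}) \ge 1 - \frac{(1-\epsilon)M(\theta)}{2} > \epsilon$.

Hence

$Gap(M_H^{(\epsilon)}) \ge \frac{m(\epsilon,\theta)}{3N^2}$

and (3.4) yield

$Gap(M^{(\epsilon)}) \ge \frac{h}{N^2}$

for a suitable $h = h(\theta,\epsilon) > 0$. This shows that $M^{(\epsilon)}$ is fast mixing for every $\epsilon > 0$ and for every
$\theta > 1$ while $M_E$ is slowly mixing for every $\theta > 1$.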
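The contrast between the two chains of this section can be observed numerically on a small instance. The following sketch is an illustration under stated assumptions: the parameter values ($N = 12$, $\theta = 2$, $\epsilon = 1/2$) and all function names are hypothetical, and the eigenvalues are computed with a plain cyclic Jacobi iteration applied to the symmetric matrix $S(x,y) = \sqrt{M(x,y)M(y,x)}$, which has the same spectrum as a reversible $M$.

```python
import math


def metropolis(K, pi):
    """Metropolis matrix (1.2) for proposal K and target pi."""
    n = len(pi)
    M = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y and K[x][y] > 0.0:
                M[x][y] = K[x][y] * min(pi[y] * K[y][x] / (pi[x] * K[x][y]), 1.0)
        M[x][x] = 1.0 - sum(M[x])
    return M


def eigenvalues_reversible(M):
    """Eigenvalues of a reversible chain, in decreasing order (cyclic Jacobi)."""
    n = len(M)
    S = [[math.sqrt(M[x][y] * M[y][x]) for y in range(n)] for x in range(n)]
    for _ in range(100):
        off = sum(S[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-26:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(S[p][q]) < 1e-300:
                    continue
                t0 = (S[q][q] - S[p][p]) / (2.0 * S[p][q])
                t = math.copysign(1.0, t0) / (abs(t0) + math.sqrt(t0 * t0 + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    S[k][p], S[k][q] = c * S[k][p] - s * S[k][q], s * S[k][p] + c * S[k][q]
                for k in range(n):  # rotate rows p and q
                    S[p][k], S[q][k] = c * S[p][k] - s * S[q][k], s * S[p][k] + c * S[q][k]
    return sorted((S[i][i] for i in range(n)), reverse=True)


def warmup_chains(N, theta, eps):
    """Return (M_E, M_eps) for the warming up example."""
    states = list(range(-N, N + 1))
    idx = {x: i for i, x in enumerate(states)}
    n = len(states)
    Z = 2 * theta ** (N + 1) - 1 - theta
    pi = [(theta - 1) * theta ** abs(x) / Z for x in states]
    K_E = [[0.0] * n for _ in range(n)]
    for x in states:
        for y in (x - 1, x + 1):
            if y in idx:
                K_E[idx[x]][idx[y]] = 0.5
            else:
                K_E[idx[x]][idx[x]] = 0.5  # hold at the endpoints -N and N
    K_eps = [[(1 - eps) * K_E[i][j] for j in range(n)] for i in range(n)]
    for x in states:
        K_eps[idx[x]][idx[-x]] += eps  # long jump x -> -x
    return metropolis(K_E, pi), metropolis(K_eps, pi)


def gap(M):
    lam = eigenvalues_reversible(M)
    return 1.0 - max(lam[1], abs(lam[-1]))


M_E, M_eps = warmup_chains(N=12, theta=2.0, eps=0.5)
gap_E, gap_eps = gap(M_E), gap(M_eps)
```

With these values `gap_E` is bounded by $2\pi(0)/(1-\pi(0))$ through (3.2), while `gap_eps` stays orders of magnitude larger; increasing `N` widens the difference.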

5. The mean field Ising model 

Let $\mathcal{X} = \{-1, 1\}^N$, $N$ being an even integer. For every $\beta > 0$ let $\pi = \pi_{\beta,N}$ be a
probability on $\mathcal{X}$ defined by

$\pi(x) = \pi_{\beta,N}(x) := \exp\left\{\frac{\beta}{2N}S_N^2(x)\right\} Z_N^{-1}(\beta)$ $\quad (x \in \mathcal{X})$

where

$Z_N(\beta) := \sum_{x \in \mathcal{X}} \exp\left\{\frac{\beta}{2N}S_N^2(x)\right\}$

is the normalization constant ("partition function") and

$S_N(x) := \sum_{i=1}^{N} x_i$, $\quad x = (x_1, \dots, x_N)$.

This is the so-called mean field Ising model, or Curie-Weiss model, in which every
particle $i$, with spin $x_i$, interacts equally with every other particle. It is probably
the simplest but also the most studied example of a spin system on a complete
graph. The usual Metropolis algorithm uses as proposal chain

$K_E(x,y) = \frac{1}{N}\sum_{j=1}^{N} I_{\{x^{(j)}\}}(y)$

where $x^{(j)}$ denotes the vector $(x_1, \dots, -x_j, \dots, x_N)$. It has been proved in [2^1 that,
whenever $\beta > 1$,

$1 - \lambda_1 \le Ce^{-DN}$

where $C$ and $D$ are positive constants and $\lambda_1$ is the first eigenvalue smaller than 1 of the Metropolis chain $M_E$ derived
from $K_E$. This yields that the variance of an estimator obtained from this Metropolis
algorithm can blow up exponentially fast in $N$.

The aim of this section is to show how one can construct a different Metropolis
chain avoiding this problem. In the notation of Section 2 we consider

$\mathcal{G} = \mathcal{S}_N \times \{+1, -1\}$

($\mathcal{S}_N$ being the symmetric group of order $N$) and we define the action of $\mathcal{G}$ on
$\mathcal{X} = \{-1, 1\}^N$ by

$g(x) = (e \cdot x_{\sigma(1)}, \dots, e \cdot x_{\sigma(N)})$, $\quad g = (\sigma, e)$.

In order to introduce a new proposal, it is useful to write $\mathcal{X}$ as the union of its
"energy sets", that is

$\mathcal{X} = \mathcal{X}_0 \cup \mathcal{X}_2 \cup \mathcal{X}_4 \cup \dots \cup \mathcal{X}_N$

where

$\mathcal{X}_i := \{x \in \mathcal{X} : |S_N(x)| = i\}$ $\quad (i = 0, 2, \dots, N)$.

Note that $S_N$ takes only even values and that $O_x = \mathcal{X}_{|S_N(x)|}$. Moreover, for
$i \neq 0$, set

$\mathcal{X}_i^+ := \{x \in \mathcal{X} : S_N(x) = i\}$ and $\mathcal{X}_i^- := \{x \in \mathcal{X} : S_N(x) = -i\}$.

The new proposal chain will be

(5.1)
$K(x,y) = p_1K_E(x,y) + (1-p_1)K_0(x,y)$ $\quad$ if $x \in \mathcal{X}_0$
$K(x,y) = p_1K_E(x,y) + p_2I_{\{-x\}}(y) + (1-p_1-p_2)K_i(x,y)$ $\quad$ if $x \in \mathcal{X}_i$, $i \neq 0$

where $p_1, p_2$ belong to $(0,1)$, $p_1 + p_2 < 1$, and

$K_i(x,y) = I_{\mathcal{X}_i^+}(x)K_i^+(x,y) + I_{\mathcal{X}_i^-}(x)K_i^-(x,y)$ $\quad (i \neq 0)$.

We shall assume that $K_i^\pm$ ($K_0$, respectively) are irreducible, symmetric and
aperiodic chains on $\mathcal{X}_i^\pm$ ($\mathcal{X}_0$, respectively).
As a leading example we shall take

(5.2)
$K_0(x,y) = \binom{N}{N/2}^{-1}$, $\quad y \in \mathcal{X}_0$
$K_i^\pm(x,y) = \binom{N}{(N-i)/2}^{-1}$, $\quad y \in \mathcal{X}_i^\pm$

that is: a realization of a chain $K_i^\pm$ ($K_0$, respectively) is simply a sequence of
independent uniform random samples from $\mathcal{X}_i^\pm$ ($\mathcal{X}_0$, respectively).

Remark 1. Note that (5.2) is the $(n,k)$-Bose-Einstein distribution with $n = (N+i)/2$
and $k = (N-i)/2 + 1$ and recall that there is a very easy way to directly
generate Bose-Einstein configurations. One may place $n$ balls sequentially into $k$
boxes, each time choosing a box with probability proportional to its current content
plus one. Starting from the empty configuration this results in a Bose-Einstein
distribution at every stage.
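The urn scheme of Remark 1 is short enough to be written out. In the sketch below, `bose_einstein` follows the description above, while `occupancy_to_spins` uses a stars-and-bars encoding (each ball becomes a $+1$, each wall between two consecutive boxes a $-1$) to turn an occupancy vector into a configuration with $S_N = i$; this encoding and both function names are assumptions of the sketch, not taken from the paper.

```python
import random


def bose_einstein(n, k, rng):
    """Place n balls into k boxes; each ball picks a box with probability
    proportional to (current content + 1)."""
    boxes = [0] * k
    for _ in range(n):
        weights = [c + 1 for c in boxes]
        j = rng.choices(range(k), weights=weights)[0]
        boxes[j] += 1
    return boxes


def occupancy_to_spins(boxes):
    """Stars-and-bars encoding: balls are +1, walls between boxes are -1."""
    spins = []
    for b, c in enumerate(boxes):
        spins.extend([1] * c)
        if b < len(boxes) - 1:
            spins.append(-1)
    return spins


N, i = 10, 4
rng = random.Random(0)
boxes = bose_einstein((N + i) // 2, (N - i) // 2 + 1, rng)
x = occupancy_to_spins(boxes)
```

Each ball costs a weighted draw among at most $k$ boxes, so one sample takes of the order of $n \cdot k$ elementary operations in this naive form.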

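Now let $M$ be the Metropolis chain defined by the transition kernel (1.2) with
$K$ as in (5.1), i.e. for every $x$ in $\mathcal{X}_i^\pm$ ($i \neq 0$)

$M(x,y) = \begin{cases} \frac{p_1}{N}\min\left(1, \frac{\pi(y)}{\pi(x)}\right) & \text{if } y = x^{(j)},\ j = 1, \dots, N \\ p_2 & \text{if } y = -x \\ (1-p_1-p_2)K_i^\pm(x,y) & \text{if } y \in \mathcal{X}_i^\pm,\ y \neq x \\ 1 - \sum_{z \neq x} M(x,z) & \text{if } y = x \end{cases}$

while for $x$ in $\mathcal{X}_0$

$M(x,y) = \begin{cases} \frac{p_1}{N}\min\left(1, \frac{\pi(y)}{\pi(x)}\right) & \text{if } y = x^{(j)},\ j = 1, \dots, N \\ (1-p_1)K_0(x,y) & \text{if } y \in \mathcal{X}_0,\ y \neq x \\ 1 - \sum_{z \neq x} M(x,z) & \text{if } y = x. \end{cases}$

By construction $M$ is an aperiodic, irreducible and reversible chain with stationary
distribution $\pi$. Then, when (5.2) holds true,

$M(x,y) = \begin{cases} \frac{p_1}{N}\min\left(1, \frac{\pi(y)}{\pi(x)}\right) & \text{if } y = x^{(j)},\ j = 1, \dots, N \\ p_2 & \text{if } y = -x \\ (1-p_1-p_2)\binom{N}{(N-i)/2}^{-1} & \text{if } y \in \mathcal{X}_i^\pm,\ y \neq x \\ 1 - \sum_{z \neq x} M(x,z) & \text{if } y = x \end{cases}$

for $x$ in $\mathcal{X}_i^\pm$ ($i \neq 0$), while if $x$ belongs to $\mathcal{X}_0$

$M(x,y) = \begin{cases} \frac{p_1}{N}\min\left(1, \frac{\pi(y)}{\pi(x)}\right) & \text{if } y = x^{(j)},\ j = 1, \dots, N \\ (1-p_1)\binom{N}{N/2}^{-1} & \text{if } y \in \mathcal{X}_0,\ y \neq x \\ 1 - \sum_{z \neq x} M(x,z) & \text{if } y = x. \end{cases}$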

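In order to bound the spectral gap of $M$ we shall use the decomposition theorem
described in Subsection 3.3. To this end, for every $i = 0, 2, \dots, N$ and every $j \neq i$
set

$P(i,j) := \frac{1}{2\pi(\mathcal{X}_i)}\sum_{x \in \mathcal{X}_i}\sum_{y \in \mathcal{X}_j} M(x,y)\pi(x)$

and

$P(i,i) := 1 - \sum_{j \neq i} P(i,j)$.

As already noted, $P$ is a reversible chain on $\{0, 2, \dots, N\}$ with stationary
distribution

$\bar{\pi}(i) := \pi(\mathcal{X}_i)$.

Moreover define for every $i = 0, 2, \dots, N$ a chain on $\mathcal{X}_i$ setting

$P_{\mathcal{X}_i}(x,y) := M(x,y) + I_{\{x\}}(y)\sum_{z \in \mathcal{X}_i^c} M(x,z)$

where both $x$ and $y$ belong to $\mathcal{X}_i$. In the same way, define chains on $\mathcal{X}_i^+$ and $\mathcal{X}_i^-$
for $i = 2, \dots, N$ setting

$P_{\mathcal{X}_i^\pm}(x,y) := P_{\mathcal{X}_i}(x,y)$ $\quad (y \neq x,\ x,y \in \mathcal{X}_i^\pm)$

and

$P_{\mathcal{X}_i^\pm}(x,x) := 1 - \sum_{y \in \mathcal{X}_i^\pm,\ y \neq x} P_{\mathcal{X}_i}(x,y)$.

These chains are reversible on $\mathcal{X}_i$ ($\mathcal{X}_i^\pm$, respectively) and have as stationary
distributions

$\pi_{\mathcal{X}_i}(x) := \frac{\pi(x)}{\pi(\mathcal{X}_i)} = \frac{1}{|\mathcal{X}_i|}$ and $\pi_{\mathcal{X}_i^\pm}(x) := \frac{\pi_{\mathcal{X}_i}(x)}{\pi_{\mathcal{X}_i}(\mathcal{X}_i^\pm)} = \frac{1}{|\mathcal{X}_i^\pm|}$

respectively. Finally, for every $i = 2, 4, \dots, N$, define a chain on $\{+, -\}$ setting

$P_i(+,-) := \frac{1}{2\pi_{\mathcal{X}_i}(\mathcal{X}_i^+)}\sum_{x \in \mathcal{X}_i^+}\sum_{y \in \mathcal{X}_i^-} P_{\mathcal{X}_i}(x,y)\pi_{\mathcal{X}_i}(x)$

$P_i(-,+) := \frac{1}{2\pi_{\mathcal{X}_i}(\mathcal{X}_i^-)}\sum_{x \in \mathcal{X}_i^-}\sum_{y \in \mathcal{X}_i^+} P_{\mathcal{X}_i}(x,y)\pi_{\mathcal{X}_i}(x)$.

Now the lower bound (3.4), applied two times, yields

(5.3)
$Gap(M) \ge \frac{1}{2}Gap(P)\min_{i}\{Gap(P_{\mathcal{X}_i})\}$
$\ge \frac{1}{2}Gap(P)\min\left[Gap(P_{\mathcal{X}_0}),\ \min_{i \neq 0}\left\{\frac{1}{2}Gap(P_i)\min\{Gap(P_{\mathcal{X}_i^+}), Gap(P_{\mathcal{X}_i^-})\}\right\}\right]$.

Hence, to get a lower bound on $Gap(M)$ it is enough to obtain bounds on the gaps
of the chains $P$, $P_{\mathcal{X}_0}$, $P_i$, $P_{\mathcal{X}_i^\pm}$.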

The most important of these bounds is given by the following 

Proposition 5.1. $P$ is a birth and death chain on $\{0, 2, \dots, N\}$, more precisely

(5.4)
$P(0,2) = \frac{p_1}{2}$
$P(i,i+2) = p_1\frac{N-i}{4N}$, $\quad i \neq N, 0$
$P(i,i-2) = p_1\frac{N+i}{4N}\exp\{2\beta(1-i)/N\}$, $\quad i \neq 0$.

Moreover

$\lambda_1(P) \le 1 - \frac{p_1}{16(N/2+1)^3}$

and

$\lambda_{N/2}(P) \ge 1 - p_1$.

The proof of the previous proposition is based on a bound for a birth and death
chain, given in the Appendix, which may be of independent interest.
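The transition probabilities of the projected chain can be checked against its stationary distribution $\bar{\pi}(i) = \pi(\mathcal{X}_i)$. The sketch below assumes the form of (5.4) with $P(0,2) = p_1/2$, $P(i,i+2) = p_1(N-i)/(4N)$ and $P(i,i-2) = p_1(N+i)\exp\{2\beta(1-i)/N\}/(4N)$; the function names and the parameter values are hypothetical.

```python
import math


def ising_projection_chain(N, beta, p1):
    """Birth and death chain P on {0, 2, ..., N}, stored as a dict of dicts."""
    levels = list(range(0, N + 1, 2))
    P = {i: {} for i in levels}
    for i in levels:
        if i == 0:
            P[i][2] = p1 / 2
        else:
            if i != N:
                P[i][i + 2] = p1 * (N - i) / (4 * N)
            P[i][i - 2] = p1 * (N + i) / (4 * N) * math.exp(2 * beta * (1 - i) / N)
        P[i][i] = 1.0 - sum(P[i].values())
    return levels, P


def level_weights(N, beta):
    """Unnormalised pi(X_i): number of configurations with |S_N| = i
    times exp(beta * i^2 / (2N))."""
    w = {}
    for i in range(0, N + 1, 2):
        count = math.comb(N, (N - i) // 2) * (1 if i == 0 else 2)
        w[i] = count * math.exp(beta * i * i / (2 * N))
    return w


levels, P = ising_projection_chain(N=20, beta=1.5, p1=0.5)
w = level_weights(20, 1.5)
```

Detailed balance, $\bar{\pi}(i)P(i,i+2) = \bar{\pi}(i+2)P(i+2,i)$, can then be verified level by level, which is a convenient consistency check on the three formulas.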
As for the other chains, we have the following

Lemma 5.2. For every $i = 2, 4, \dots, N$

$Gap(P_{\mathcal{X}_i^\pm}) \ge (1-p_1-p_2)Gap(K_i^\pm)$

$Gap(P_i) = p_2$,

moreover

$Gap(P_{\mathcal{X}_0}) \ge (1-p_1)Gap(K_0)$.

In this way, using (5.3), we can prove the main result of this section.

Proposition 5.3. Let $M$ be the Metropolis chain derived from the chain $K$ defined
as in (5.1), then

$Gap(M) \ge \frac{p_1}{32(N/2+1)^3}\min\left[(1-p_1)Gap(K_0),\ \frac{(1-p_1-p_2)p_2}{2}\min_{i \neq 0}\min\{Gap(K_i^+), Gap(K_i^-)\}\right]$.

If $K_i^\pm$ and $K_0$ are defined as in (5.2) then

$Gap(M) \ge \frac{p_1}{32(N/2+1)^3}\min\left[(1-p_1),\ \frac{(1-p_1-p_2)p_2}{2}\right]$

for every $\beta > 0$ and $N \ge N_0$.

Proposition 5.3 shows that the gap is polynomial in $1/N$ independently of $\beta$.
Hence, even when $\beta > 1$, the variance of the Metropolis estimate obtained with this
proposal cannot grow faster than a polynomial in $N$.

Note that if in Proposition 5.3 we choose

(5.5) pi = l-a/{2N), p2^a/N 

we get 

Gap{M) > ^. 

Hence, even with this choice, the Metropolis algorithm is still fast mixing for every
$\beta$. It is worth noticing that the mean computational cost of this Metropolis does
not change with respect to the Metropolis which uses the proposal $K_E$. Indeed,
in the case of the usual Metropolis, the computational cost needed to go from $X_n$
to $X_{n+1}$ is $O(N)$, since it is essentially due to a sample of one number among $N$
numbers (we need to decide which coordinate to flip). In the case of the "modified"
proposal, things are slightly more complex. In this case, at the beginning, we have
an extra "toss". If with this first toss we decide to flip a coordinate at random the
cost is still $O(N)$, but if we need to sample from $K_i^\pm$ the cost is $O(N^2)$ (in this last
case we need to pick a sample from a Bose-Einstein distribution). Hence, although
our algorithm is "sometimes" more expensive, if we take $p_1$ and $p_2$ as in (5.5), we
get that the mean cost of our algorithm is still $O(N)$.
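One step of the modified chain can be sketched as follows, assuming the proposal (5.1) with the uniform level chains (5.2). The function names are hypothetical. The uniform draw from the current energy level is implemented here by shuffling the coordinates, which is a choice of this sketch and not the Bose-Einstein urn scheme discussed above; global flips and level moves are always accepted because they leave $\pi$ unchanged and are proposed symmetrically.

```python
import math
import random


def propose(x, p1, p2, rng):
    """Draw y from the proposal (5.1); returns (y, kind) with kind in
    {"local", "flip", "level"}."""
    N = len(x)
    u = rng.random()
    if u < p1:                           # local move: flip one coordinate
        j = rng.randrange(N)
        return x[:j] + [-x[j]] + x[j + 1:], "local"
    if sum(x) != 0 and u < p1 + p2:      # global flip x -> -x (only off X_0)
        return [-v for v in x], "flip"
    y = x[:]                             # uniform draw with the same S_N
    rng.shuffle(y)
    return y, "level"


def metropolis_step(x, beta, p1, p2, rng):
    """One step of the Metropolis chain M; returns (new_state, accepted)."""
    y, kind = propose(x, p1, p2, rng)
    if kind == "local":
        N = len(x)
        log_ratio = beta * (sum(y) ** 2 - sum(x) ** 2) / (2 * N)
        if rng.random() >= math.exp(min(0.0, log_ratio)):
            return x, False              # rejected: stay at x
    return y, True


rng = random.Random(1)
state = [1] * 6 + [-1] * 4
for _ in range(100):
    state, _ = metropolis_step(state, beta=1.5, p1=0.5, p2=0.25, rng=rng)
```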



6. The mean-field Blume-Emery-Griffiths model 

The Blume-Emery-Griffiths (BEG) model (see [2]) is an important lattice-spin
model in statistical mechanics. It has been studied extensively as a model of many
diverse systems, including He^3-He^4 mixtures as well as solid-liquid-gas systems,
microemulsions, semiconductor alloys and electronic conduction models. See, for
instance, [21 [351 [121 [23 [33 [3S1 [H]. We will focus our attention on a simplified
mean-field version of the BEG model. For a mathematical treatment of this
mean-field model see [10]. In what follows let $\mathcal{X} := \{-1, 0, 1\}^N$, $N$ being an even integer,
and for every $\beta > 0$ and $K > 0$ let $\pi_{\beta,K,N}$ be the probability defined by

$\pi(x) = \pi_{\beta,K,N}(x) = \exp\left\{-\beta R_N(x) + \frac{\beta K}{N}S_N^2(x)\right\} Z_N^{-1}(\beta,K)$ $\quad (x \in \mathcal{X})$

where

$Z_N(\beta,K) = Z_N = \sum_{x \in \mathcal{X}} \exp\left\{-\beta R_N(x) + \frac{\beta K}{N}S_N^2(x)\right\}$

is the normalization constant,

$S_N(x) := \sum_{i=1}^{N} x_i$ and $R_N(x) := \sum_{i=1}^{N} x_i^2$, $\quad x = (x_1, x_2, \dots, x_N)$.

A natural Metropolis algorithm can be derived by using the proposal chain

(6.1) $K_E(x,y) = \frac{1}{2N}\sum_{j=1}^{N}\left[I_{\{x^{(+j)}\}}(y) + I_{\{x^{(-j)}\}}(y)\right]$

where $x^{(\pm j)}$ denotes the vector $(x_1, \dots, x_j \pm 1, \dots, x_N)$, with the convention that
$2 = -1$ and $-2 = 1$.

The next proposition shows that there exists a critical region of the parameter
space in which the Metropolis chain is slowly mixing. More precisely, using some
results of [10] it is quite straightforward to prove the following

Proposition 6.1. Let $M_E$ be the Metropolis chain (with stationary distribution $\pi$)
with proposal chain $K_E$ defined in (6.1). Then, there exists a non-decreasing function
$\Gamma : (0, +\infty) \to (0, +\infty)$ with $\lim_{x \to 0^+}\Gamma(x) = +\infty$ and $\lim_{x \to +\infty}\Gamma(x) = \gamma_c \simeq 1.082$
such that for every couple of positive parameters $(\beta, K)$ with $K > \Gamma(\beta)$

$Gap(M_E) \le Ce^{-\Delta N}$

for suitable constants $C = C(\beta, K) > 0$ and $\Delta = \Delta(\beta, K) > 0$.
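The local proposal (6.1) and the corresponding Metropolis acceptance step can be sketched in a few lines. The sketch assumes the weight $\exp\{-\beta R_N(x) + (\beta K/N)S_N^2(x)\}$ and the cyclic convention $2 = -1$, $-2 = 1$; the function names and the parameter values in the example are hypothetical.

```python
import math
import random


def shift(v, sign):
    """Cyclic shift on {-1, 0, 1}: v + sign with the convention 2 = -1, -2 = 1."""
    return (v + sign + 1) % 3 - 1


def beg_log_weight(x, beta, K):
    """Log of the unnormalised weight: -beta * R_N(x) + (beta * K / N) * S_N(x)^2."""
    N = len(x)
    S = sum(x)
    R = sum(v * v for v in x)
    return -beta * R + beta * K * S * S / N


def local_metropolis_step(x, beta, K, rng):
    """One step of M_E: pick a site and a direction, then accept or reject."""
    j = rng.randrange(len(x))
    sign = rng.choice((-1, 1))
    y = x[:j] + [shift(x[j], sign)] + x[j + 1:]
    log_ratio = beg_log_weight(y, beta, K) - beg_log_weight(x, beta, K)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return y
    return x


rng = random.Random(3)
x = [0] * 8
for _ in range(200):
    x = local_metropolis_step(x, beta=2.0, K=0.5, rng=rng)
```

Since the move $x \to x^{(+j)}$ is undone by $x^{(+j)} \to (x^{(+j)})^{(-j)}$ with the same proposal probability, the proposal is symmetric and the acceptance ratio reduces to $\pi(y)/\pi(x)$.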

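As in the case of the mean-field Ising model, we intend to bypass the slow
mixing of this Metropolis chain by choosing a different proposal. To
understand which kind of proposal is reasonable, here we choose

$\mathcal{G} = \mathcal{S}_N \times \{+1, -1\}$

with $\mathcal{G}$ acting on $\mathcal{X} = \{-1, 0, 1\}^N$ by

$g(x) = (e \cdot x_{\sigma(1)}, \dots, e \cdot x_{\sigma(N)})$, $\quad g = (\sigma, e)$.

At this stage, decompose $\mathcal{X}$ as the union of its "energy sets", that is

$\mathcal{X} = \mathcal{X}_{0,0} \cup \mathcal{X}_{1,1} \cup \mathcal{X}_{0,2} \cup \mathcal{X}_{2,2} \cup \mathcal{X}_{1,3} \cup \mathcal{X}_{3,3} \cup \dots \cup \mathcal{X}_{0,N} \cup \mathcal{X}_{2,N} \cup \dots \cup \mathcal{X}_{N,N}$

where

$\mathcal{X}_{s,r} := \{x \in \mathcal{X} : |S_N(x)| = s \text{ and } R_N(x) = r\}$,

$r = 0, 1, 2, \dots, N$ and $s = 1, 3, \dots, r$ if $r$ is odd and $s = 0, 2, \dots, r$ if $r$ is even.
Moreover, for $s = 1, 2, \dots, N$, set

$\mathcal{X}_{s,r}^+ := \{x \in \mathcal{X} : S_N(x) = s \text{ and } R_N(x) = r\}$

and

$\mathcal{X}_{s,r}^- := \{x \in \mathcal{X} : S_N(x) = -s \text{ and } R_N(x) = r\}$.

Note again that $O_x = \mathcal{X}_{s,r}$ with $s = |S_N(x)|$ and $r = R_N(x)$. The new proposal
chain will be

(6.2)
$K(x,y) = p_1K_E(x,y) + (1-p_1)K_{0,r}(x,y)$ $\quad$ if $x \in \mathcal{X}_{0,r}$, $r = 0, 2, \dots, N$
$K(x,y) = p_1K_E(x,y) + p_2I_{\{-x\}}(y) + (1-p_1-p_2)K_{s,r}(x,y)$ $\quad$ if $x \in \mathcal{X}_{s,r}$, $s \neq 0$

where $p_1, p_2$ belong to $(0,1)$, $p_1 + p_2 < 1$, and

$K_{s,r}(x,y) = I_{\mathcal{X}_{s,r}^+}(x)K_{s,r}^+(x,y) + I_{\mathcal{X}_{s,r}^-}(x)K_{s,r}^-(x,y)$ $\quad (s \neq 0)$

with

(6.3)
$K_{0,r}(x,y) = \left[\binom{N}{r}\binom{r}{r/2}\right]^{-1}$, $\quad y \in \mathcal{X}_{0,r}$
$K_{s,r}^\pm(x,y) = \left[\binom{N}{r}\binom{r}{(r-s)/2}\right]^{-1}$, $\quad y \in \mathcal{X}_{s,r}^\pm$.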

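Now let $M$ be the Metropolis chain defined by the transition kernel (1.2) with $K$
as in (6.2), i.e. for every $x$ in $\mathcal{X}_{s,r}^\pm$ ($s \neq 0$)

$M(x,y) = \begin{cases} \frac{p_1}{2N}\min\left(1, \frac{\pi(y)}{\pi(x)}\right) & \text{if } y = x^{(\pm j)},\ j = 1, \dots, N \\ p_2 & \text{if } y = -x \\ (1-p_1-p_2)\left[\binom{N}{r}\binom{r}{(r-s)/2}\right]^{-1} & \text{if } y \in \mathcal{X}_{s,r}^\pm,\ y \neq x \\ 1 - \sum_{z \neq x} M(x,z) & \text{if } y = x \end{cases}$

while if $x$ belongs to $\mathcal{X}_{0,r}$

$M(x,y) = \begin{cases} \frac{p_1}{2N}\min\left(1, \frac{\pi(y)}{\pi(x)}\right) & \text{if } y = x^{(\pm j)},\ j = 1, \dots, N \\ (1-p_1)\left[\binom{N}{r}\binom{r}{r/2}\right]^{-1} & \text{if } y \in \mathcal{X}_{0,r},\ y \neq x \\ 1 - \sum_{z \neq x} M(x,z) & \text{if } y = x. \end{cases}$

By construction $M$ is an aperiodic, irreducible and reversible chain with stationary
distribution $\pi$.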

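Also in this case, to bound the spectral gap of $M$, we shall use the chain
decomposition tools. Let

$D_N = \{(0,0), (1,1), (0,2), (2,2), (1,3), (3,3), (0,4), (2,4), (4,4), \dots, (0,N), (2,N), \dots, (N,N)\}$

and, for every couple $(s,r)$, $(\tilde{s},\tilde{r})$ in $D_N$, with $(s,r) \neq (\tilde{s},\tilde{r})$, let

$P((s,r),(\tilde{s},\tilde{r})) := \frac{1}{2\pi(\mathcal{X}_{s,r})}\sum_{x \in \mathcal{X}_{s,r}}\sum_{y \in \mathcal{X}_{\tilde{s},\tilde{r}}} M(x,y)\pi(x)$

and

$P((s,r),(s,r)) := 1 - \sum_{(\tilde{s},\tilde{r}) \neq (s,r)} P((s,r),(\tilde{s},\tilde{r}))$.

Once again, note that $P$ is a reversible chain on $D_N$ with stationary distribution

$\bar{\pi}(s,r) := \pi(\mathcal{X}_{s,r})$.

Moreover, for every $(s,r)$ in $D_N$, define a chain on $\mathcal{X}_{s,r}$ setting

$P_{\mathcal{X}_{s,r}}(x,y) := M(x,y) + I_{\{x\}}(y)\sum_{z \in \mathcal{X}_{s,r}^c} M(x,z)$

where both $x$ and $y$ belong to $\mathcal{X}_{s,r}$. In the same way, define chains on $\mathcal{X}_{s,r}^+$ and $\mathcal{X}_{s,r}^-$
for $(s,r)$ in $D_N$, $s \neq 0$, setting

$P_{\mathcal{X}_{s,r}^\pm}(x,y) := P_{\mathcal{X}_{s,r}}(x,y)$ $\quad (y \neq x,\ x,y \in \mathcal{X}_{s,r}^\pm)$

and

$P_{\mathcal{X}_{s,r}^\pm}(x,x) := 1 - \sum_{y \in \mathcal{X}_{s,r}^\pm,\ y \neq x} P_{\mathcal{X}_{s,r}}(x,y)$.

These chains are reversible on $\mathcal{X}_{s,r}$ ($\mathcal{X}_{s,r}^\pm$, respectively) and have as stationary
distributions

$\pi_{\mathcal{X}_{s,r}}(x) := \frac{\pi(x)}{\pi(\mathcal{X}_{s,r})} = \frac{1}{|\mathcal{X}_{s,r}|}$ and $\pi_{\mathcal{X}_{s,r}^\pm}(x) := \frac{\pi_{\mathcal{X}_{s,r}}(x)}{\pi_{\mathcal{X}_{s,r}}(\mathcal{X}_{s,r}^\pm)} = \frac{1}{|\mathcal{X}_{s,r}^\pm|}$

respectively. Finally, for every $(s,r)$ in $D_N$, $s \neq 0$, define a chain on $\{+, -\}$ setting

$P_{s,r}(+,-) := \frac{1}{2\pi_{\mathcal{X}_{s,r}}(\mathcal{X}_{s,r}^+)}\sum_{x \in \mathcal{X}_{s,r}^+}\sum_{y \in \mathcal{X}_{s,r}^-} P_{\mathcal{X}_{s,r}}(x,y)\pi_{\mathcal{X}_{s,r}}(x)$

$P_{s,r}(-,+) := \frac{1}{2\pi_{\mathcal{X}_{s,r}}(\mathcal{X}_{s,r}^-)}\sum_{x \in \mathcal{X}_{s,r}^-}\sum_{y \in \mathcal{X}_{s,r}^+} P_{\mathcal{X}_{s,r}}(x,y)\pi_{\mathcal{X}_{s,r}}(x)$.

At this stage, the lower bound (3.4), applied two times, yields

(6.4)
$Gap(M) \ge \frac{1}{2}Gap(P)\min_{(s,r) \in D_N}\{Gap(P_{\mathcal{X}_{s,r}})\}$
$\ge \frac{1}{2}Gap(P)\min\left[\min_{r}\{Gap(P_{\mathcal{X}_{0,r}})\},\ \min_{s \neq 0}\left\{\frac{1}{2}Gap(P_{s,r})\min\{Gap(P_{\mathcal{X}_{s,r}^+}), Gap(P_{\mathcal{X}_{s,r}^-})\}\right\}\right]$.

To derive from the last bound a more explicit bound we need some preliminary
work. The first result we need is exactly the analogue of Lemma 5.2.

Lemma 6.2. For every $r = 1, \dots, N$

$Gap(P_{\mathcal{X}_{0,r}}) \ge (1-p_1)Gap(K_{0,r}) = (1-p_1)$,

moreover, for every $(s,r)$ in $D_N$ with $s \neq 0$,

$Gap(P_{\mathcal{X}_{s,r}^\pm}) \ge (1-p_1-p_2)Gap(K_{s,r}^\pm) = (1-p_1-p_2)$.

Finally, for every $(s,r)$ in $D_N$ with $s \neq 0$,

$Gap(P_{s,r}) = p_2$.

Hence, (6.4) can be rewritten as

(6.5) $Gap(M) \ge Gap(P)\min\{(1-p_1)/2,\ p_2(1-p_1-p_2)/4\}$.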

It remains to bound $Gap(P)$. Unfortunately the analogue of Proposition 5.1
is not so simple, hence we shall require an additional hypothesis. In what follows
let



q\[N\\{r) := (^le 



2E 



r\ kp 



if r is even 



-fir 



2E 



r \ ft/3 



if r is odd 



r = 0, 1, . . . , iV and set 

$A = \{\beta > 0, K > 0 : \exists N_0 \text{ such that } \forall N \ge N_0,\ q_N \text{ is unimodal}\}$.
Lemma 6.3. For every $(\beta, K)$ in $A$

GMP) > ^ 

for a suitable constant $C = C(\beta, K)$.

Under the same assumptions of the previous Lemma we can state the main 
results of this section. 



Proposition 6.4. For every {(3, K) in A 

Gap{M) > 



Are 



for a suitable constant $C = C(\beta, K)$.

[Figure 1, two panels: $\beta = 2.3$ with $K = 0.5, 0.6, 0.7, 0.8$; $\beta = 2.3, 2.4, 2.5, 2.6$ with $K = 0.5$.]

Figure 1. The function $q_N$ for $N = 15$ and few values of $\beta$ and $K$.

We conjecture that $Gap(P)$ is polynomial in $N$ for every $(\beta, K)$ such that $\beta \neq
\Gamma(K)$ (where $\Gamma$ is the function of Proposition 6.1), but we are not able to prove this
conjecture. In point of fact we conjecture that $\mathbb{R}^+ \times \mathbb{R}^+ \setminus \{(\beta, K) : \Gamma(K) = \beta\} \subset A$.
We plotted $q_N$ for different $N$, $\beta$ and $K$ and these plots seem, at least, to confirm
that $\mathbb{R}^+ \times \mathbb{R}^+ \setminus \{(\beta, K) : |\Gamma(K) - \beta| < \epsilon\} \subset A$ for a suitable small $\epsilon$. In Figure 1
we show the graph of $q_N$ for few different $N$, $\beta$ and $K$.

Appendix A. The Spectral Gap of a Birth and Death Chain 

We derive here some bounds on the eigenvalues of a birth and death chain that we
shall use later. These bounds are obtained using the so-called geometric techniques,
see [7]. Let $P_n$ be a birth and death chain on $\Omega_n = \{1, \dots, n\}$. Assume that $P_n$
is reversible with respect to a probability $p_n$, that is $p_n(i)P_n(i,j) = p_n(j)P_n(j,i)$.
Moreover let

$1 > \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{n-1} \ge -1$

be the eigenvalues of $P_n$.

We can now prove the following variant of Proposition 6.3 in [6].

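Lemma A.1. If there exist positive constants $A$, $q$, $B$ and an integer $k$ such that

$P_n(i,i \pm 1) \ge An^{-q}$ $\quad (i \neq 1, n)$
$P_n(1,2) \ge An^{-q}$
$P_n(n,n-1) \ge An^{-q}$

and

$p_n(i) \le Bp_n(j)$ $\quad i < j \le k$
$p_n(j) \le Bp_n(i)$ $\quad k \le i < j$

then

$\lambda_1 \le 1 - \frac{A}{B}\,\frac{1}{n^{q+2}}$.

Proof. We use the notation and the techniques of [7], see also [3] and [5]. Choose
the set of paths

$\{\gamma_{l,m} = (l, l+1, \dots, m) : 1 \le l < m \le n\}$

and for $e = (i, i+1)$ ($i < n$) let

$\psi(e) = \frac{1}{p_n(i)P_n(i,i+1)}\sum_{\gamma_{l,m} \ni e} |\gamma_{l,m}|\,p_n(l)p_n(m)$

where $|\gamma|$ is the length of the path $\gamma$. Setting $K := \sup_e \psi(e)$ one has

$\lambda_1 \le 1 - \frac{1}{K}$

(see Proposition 1' in [7], or Exercise 6.4 page 248 in [3]). So, for our purposes, it
suffices to give an upper bound on $K$. Assume first that $e = (i, i+1)$ with
$i < k \le n$; since $|\gamma_{l,m}| \le n$ it follows that

$\psi(e) \le \frac{n}{p_n(i)P_n(i,i+1)}\left(\sum_{r \le i} p_n(r)\right)\left(\sum_{s \ge i+1} p_n(s)\right) \le \frac{n^{q+1}}{A}\left(\sum_{r \le i}\frac{p_n(r)}{p_n(i)}\right)\left(\sum_{s=1}^{n} p_n(s)\right) \le \frac{B}{A}\,n^{q+2}$.

All the other cases can be treated in the same way. Hence,

$\sup_e \psi(e) \le \frac{B}{A}\,n^{q+2}$

and then

$\lambda_1 \le 1 - \frac{A}{B}\,\frac{1}{n^{q+2}}$.

□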



As for the smallest eigenvalue, Gershgorin's theorem yields that

$\lambda_{n-1} \ge -1 + 2\min_i P_n(i,i)$.

See, for instance, Corollary 2.1 in the Appendix of [3]. Hence, if there exists a
positive constant $D$ such that

$P_n(i,i+1) + P_n(i,i-1) \le D/2$

for every $i$, then

(A.1) $\lambda_{n-1} \ge 1 - D$.

Appendix B. Proofs

To prove Proposition 5.1 we need first to show that $\bar{\pi}$ is essentially unimodal.

Lemma B.1. Let

$q_N(i) = \binom{N}{(N+i)/2}\exp\left\{\frac{\beta}{2N}i^2\right\}$, $\quad i = 0, 2, 4, \dots, N$.

For every $\beta < 1$ there exists an integer $N_0$ such that for every $N \ge N_0$

$q_N(i) \le q_N(j)$

whenever $j < i$. For every $\beta > 1$ there exists an integer $N_0$ such that for every
$N \ge N_0$

$q_N(i) \le q_N(j)$

whenever $i < j \le k_N$ and

$q_N(i) \ge q_N(j)$

whenever $k_N \le i \le j$, $k_N$ being a suitable integer.

Proof. Let AAr(i) be the ratio 

qN{i + '2) 



qN{i) 



z = 0,2, 4, ...,7V -2, 



so that 



N 



AAr(i) = 



1^ ) 2/3,, 



(4^) 

N-i [2/3,, ., 

= exp < — [l + i) 

N + 2 + i ^\N^ ' 

Setting 1^n{x) = jy^^+a ^xp (-'^ + ^ ™ [0, iV — 2], it is enough to prove that 
X Ajv(a;) takes the value 1 at most once in [0, — 2], for sufficiently large N. 
To prove this last claim first note that 



Ajv(O) 



N 



2(3 



exp , 

iV + 2 ^ I iV 

2 / 2 
1 + 2 — 

N \N 



, 2/3 

f 

2 / ,o2 



O 



2/? 1 /2/3 



N 



N 



A2 



Hence, there exists Nq in N such that for N > N^: 



/3> 1 
/3 < f 



Ajv(O) > 1 

AAr(O) < 1. 



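As for the first derivative note that

$\Lambda_N'(x) = \frac{-2(N+1) + 2\beta(N+2) - \frac{2\beta}{N}(x^2+2x)}{(N+x+2)^2} \exp\left\{\frac{2\beta}{N}(1+x)\right\},$

hence $\Lambda_N'(x) = 0$ if and only if

$-2(N+1) + 2\beta(N+2) - \frac{2\beta}{N}(x^2+2x) = 0.$

Rearranging the last equation as

$-\frac{2\beta}{N}x^2 - \frac{4\beta}{N}x + 2[(\beta-1)N + 2\beta - 1] = 0,$

one sees that the roots are

$x_{\pm} = -1 \pm \sqrt{1 + \frac{2\beta-1}{\beta}N + \frac{\beta-1}{\beta}N^2}.$

Hence, after setting

$r := -1 + \sqrt{1 + \frac{2\beta-1}{\beta}N + \frac{\beta-1}{\beta}N^2}$ and $\tilde{r} := -1 + \sqrt{1+N},$

one has, for $N$ large enough,

$\beta < 1 \Rightarrow \Lambda_N'(x) < 0 \quad \forall x \in [0, N-2],$

$\beta > 1 \Rightarrow \Lambda_N'(x) > 0$ for $x \in [0, r)$ and $\Lambda_N'(x) < 0$ for $x \in (r, N-2],$

$\beta = 1 \Rightarrow \Lambda_N'(x) > 0$ for $x \in [0, \tilde{r})$ and $\Lambda_N'(x) < 0$ for $x \in (\tilde{r}, N-2],$

and this concludes the proof. □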
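The conclusion of Lemma B.1 can be illustrated numerically. The following is a minimal sketch, assuming $q_N(i) = \binom{N}{(N+i)/2}\exp\{\beta i^2/(2N)\}$ with $N$ and $i$ even, which is the form consistent with the ratio computed in the proof; the function names and the choice $N = 200$ are hypothetical. It counts how many times the ratio crosses the value 1 along $i = 0, 2, \dots, N-2$.

```python
import math


def log_q(N, i, beta):
    """log q_N(i), assuming q_N(i) = C(N, (N+i)/2) * exp(beta * i^2 / (2N))."""
    return math.log(math.comb(N, (N + i) // 2)) + beta * i * i / (2.0 * N)


def ratio(N, x, beta):
    """Lambda_N(x) = (N - x) / (N + x + 2) * exp(2 * beta * (1 + x) / N)."""
    return (N - x) / (N + x + 2.0) * math.exp(2.0 * beta * (1.0 + x) / N)


def crossings_of_one(N, beta):
    """Number of sign changes of Lambda_N(i) - 1 along i = 0, 2, ..., N-2."""
    above = [ratio(N, i, beta) > 1.0 for i in range(0, N - 1, 2)]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)


N = 200
for beta in (0.5, 1.5):
    print(beta, ratio(N, 0, beta), crossings_of_one(N, beta))
```

For $\beta = 0.5$ the ratio starts below 1 and never crosses it, so $q_N$ is decreasing; for $\beta = 1.5$ it starts above 1 and crosses once, so $q_N$ increases up to a single mode and then decreases.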

Proof of Proposition 5.1. By direct computations it is easy to prove (5.4). Hence

Pi ^ Pi ^ Pi 



P{z,i±2)> > 4(^_^2) > 8(1 + 1)' 
and 

Pii,i + 2)+P{i,i-2) < ^. 

Now observe that 
and 

2 

ZAr(/3j 

Hence, by Lemma B.1, if $\beta < 1$

$\pi(i) \le \pi(j)$

whenever $j \le i$ and $N$ is large enough, while for $\beta > 1$

$\pi(i) \le \pi(j)$

whenever $i \le j \le k_N$ and

$\pi(i) \ge \pi(j)$

whenever $k_N \le i \le j$. The thesis now follows by Lemma A.1 and by (A.1). □
In order to prove Lemma 5.2 we recall that by Rayleigh's theorem

(B.1)  $1 - \lambda_1(P) = \inf\left\{ \frac{\mathcal{E}_p(f,f)}{\mathrm{Var}_p(f)} : f \text{ nonconstant} \right\}$

where

$\mathcal{E}_p(f,f) := \langle (I-P)f, f \rangle_p = \frac{1}{2}\sum_{x,y} (f(x)-f(y))^2 P(x,y)p(x),$

$P$ being a reversible chain w.r.t. $p$; moreover

(B.2) 1 — Ajv_i = mt < — — : / nonconstant > 

[ Varp(f) J 

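(see, for instance, Theorem 2.3 in Chapter 6 of [3] and Section 2.1 of [8]). At this
stage set

$P_\epsilon := (1-\epsilon)P + \epsilon I.$

Hence, (B.1) yields

$1 - \lambda_1(P_\epsilon) = \inf_{f \in L^2(p),\, f \ne \mathrm{const}} \frac{\frac{1}{2}\sum_{x,y}(f(x)-f(y))^2 P_\epsilon(x,y)p(x)}{\mathrm{Var}_p(f)}$

$= \inf_{f \in L^2(p),\, f \ne \mathrm{const}} (1-\epsilon)\frac{\frac{1}{2}\sum_{x,y}(f(x)-f(y))^2 P(x,y)p(x)}{\mathrm{Var}_p(f)}$

$= (1-\epsilon)(1-\lambda_1(P)).$

Arguing in the same way and using (B.2) we get

$1 - |\lambda_{|X|-1}(P_\epsilon)| \ge (1-\epsilon)(1 - |\lambda_{|X|-1}(P)|).$

Hence,

(B.3)  $\mathrm{Gap}(P_\epsilon) \ge (1-\epsilon)\,\mathrm{Gap}(P).$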
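Inequality (B.3) can be sanity-checked on two-state chains, where the spectrum is explicit. The sketch below rests on two assumptions that should be read as such: $P_\epsilon$ is taken to be the lazy chain $(1-\epsilon)P + \epsilon I$, and the gap is taken to be one minus the largest modulus of an eigenvalue different from 1. The helper names are hypothetical.

```python
def two_state_gap(a, b):
    """Gap of the chain [[1 - a, a], [b, 1 - b]], whose eigenvalues are 1 and 1 - a - b."""
    return 1.0 - abs(1.0 - a - b)


def lazy(a, b, eps):
    """Off-diagonal entries of (1 - eps) * P + eps * I for the two-state chain above."""
    return (1.0 - eps) * a, (1.0 - eps) * b


def min_slack(steps=20):
    """Smallest value of Gap(P_eps) - (1 - eps) * Gap(P) over a grid of (a, b, eps)."""
    worst = float("inf")
    for i in range(1, steps):
        for j in range(1, steps):
            for k in range(steps):
                a, b, eps = i / steps, j / steps, k / steps
                slack = two_state_gap(*lazy(a, b, eps)) - (1.0 - eps) * two_state_gap(a, b)
                worst = min(worst, slack)
    return worst


print(min_slack())               # nonnegative up to rounding if (B.3) holds on the grid
print(two_state_gap(0.15, 0.15))  # symmetric chain with a = b = p/2 has gap p
```

The second printed value covers the symmetric two-state case with off-diagonal entries $p/2$, whose gap is $p$.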

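Proof of Lemma 5.2. Note that

$P_{X_\pm}(x,y) = (1 - p_1 - p_2)K_\pm(x,y) + (p_1 + p_2)1_x(y)$

and, analogously,

$P_{X_0}(x,y) = (1 - p_1)K_0(x,y) + p_1 1_x(y).$

Hence, by (B.3),

$\mathrm{Gap}(P_{X_\pm}) \ge (1 - p_1 - p_2)\,\mathrm{Gap}(K_\pm)$

as well as

$\mathrm{Gap}(P_{X_0}) \ge (1 - p_1)\,\mathrm{Gap}(K_0).$

Finally note that $P_i$ is given by

$\begin{pmatrix} 1 - \frac{p_2}{2} & \frac{p_2}{2} \\ \frac{p_2}{2} & 1 - \frac{p_2}{2} \end{pmatrix}$

for every $i$, hence $\mathrm{Gap}(P_i) = p_2$. □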



Proof of Proposition 5.3. To prove the first part of the proposition it is enough to
combine Lemma , Proposition 5.1 and (5.3). To complete the proof observe that
$\mathrm{Gap}(K_\pm) = \mathrm{Gap}(K_0) = 1$, when $K_\pm$ and $K_0$ are given by (5.2). □

In order to prove Proposition 6.1 we need some results obtained in [10].

Theorem B.2 (Ellis-Otto-Touchette). Let $\rho_N$ be the distribution of $S_N(x)/N$ under
$\pi_{\beta,K,N}$; then $\rho_N$ satisfies a large deviation principle on $[-1, 1]$ with rate function

$I_{\beta,K}(z) = J_\beta(z) - \beta K z^2 - \inf_t \{J_\beta(t) - \beta K t^2\}$

with

Jl3[z) = sup <tz- log . , o -0 

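Moreover, if $\mathcal{E}_{\beta,K} := \operatorname{argmin} I_{\beta,K}$, then there exists a non-increasing function $\Gamma$ :
$(0, +\infty) \to (0, +\infty)$ with $\lim_{x \to 0}\Gamma(x) = +\infty$ and $\lim_{x \to \infty}\Gamma(x) = \gamma_c = 1.082$ such
that for every $(\beta, K)$ with $K > \Gamma(\beta)$

$\mathcal{E}_{\beta,K} = \{\pm z(\beta,K) \ne 0\}.$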






FEDERICO BASSETTI AND FABRIZIO LEISEN 



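In particular, for such $(\beta, K)$ and for every $0 < \epsilon < |z(\beta, K)|$ there exists a constant
$C_1 = C_1(\epsilon, \beta, K)$ such that

(B.4)  $\rho_N([0, \epsilon]) \le C_1 \exp\left\{-\frac{N}{2}\gamma_{\epsilon,\beta,K}\right\}$

with

(B.5)  $\gamma_{\epsilon,\beta,K} = \inf_{z \in [0,\epsilon]} I_{\beta,K}(z) > 0.$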

Proof. For the first part see Theorems 3.3, 3.6 and 3.8 in [10]. As for (B.4)-(B.5),
they are standard consequences of the theory of the large deviations and of the first 
part of the proposition, see, e.g., Proposition 6.4 of [IT]. □ 



Proof of Proposition 6.1. We intend to use Cheeger's inequality. To do this, let
$A := \{x : S_N(x) < 0\}$, $B := \{x : S_N(x) > 0\}$, $C := \{x : S_N(x) = 0\}$. First of all note
that, by symmetry, $\pi(A) = \pi(B) = (1 - \pi(C))/2 \le 1/2$. The main task is to bound

Now, observe that if $S_N(y) > 1$ then $M_E(y, x) = 0$ for every $x$ in $A$, hence

$\Phi(A) = \sum_{y : S_N(y) = 0} \pi(y) \sum_{x \in A} M_E(y,x) + \sum_{y : S_N(y) = 1} \pi(y) \sum_{x \in A} M_E(y,x) \le \pi\{y : S_N(y) \in \{0, 1\}\}.$

This yields a bound on the conductance

$h(\pi, M_E) \le \frac{2\pi\{y : S_N(y) \in \{0,1\}\}}{1 - \pi\{y : S_N(y) = 0\}}.$

Now by Theorem B.2 we get

$h(\pi, M_E) \le C_2 e^{-\Delta N}$

for suitable constants $C_2$ and $\Delta > 0$. The thesis follows by Cheeger inequality
(lOl. □ 



Proof of Lemma 6.2. The proof is exactly the same as the proof of Lemma 5.2. □



In order to prove Lemma 6.3 it is convenient to fix some simple properties of the
chain $P$.

Lemma B.3. $P$ is a random walk on $\mathbb{D}_N$. If $P((s,r),(\tilde{s},\tilde{r})) \ne 0$,

Piis,r),i~s,r))>P^ 
for a suitable constant $C_3 = C_3(\beta, K)$; moreover

P((.,r),(5,f))<^ 
for every $((s,r),(\tilde{s},\tilde{r})) \ne ((0,0),(1,1))$.

Proof of Lemma B.3. Easy but tedious computations show that
P((0, 0), (1, 1)) = ^ min f 1, exp{:^ - /?} 



N 



F((0,iV),(l,iV-l)) = ^ 
F((0,iV),(2,iV)) = ^ 
P((0,r),(2,r)) = |L r = 0,2,4,...,N-2
P((0,r),(l,r-1)) = |L r = 0,2,4,...,iV-2 

P((0, r), (1, r + 1)) = ^ min (^1, exp{:^ ~ r = 0, 2, 4, TV - 2 

P((.,r),(. + 2,r)) = £L(,_,) 

(s, r) e Dat, < s < iV - 2, r < iV 

P((s, r), (s - 2, r)) = ^(r- + .) exp{4:^(l - .)} 
(s,r) e DAr,0 < s < N,r < N 

P{{s, r),{s + l,r + l)) = ^{N- r) min (^1, exp{:^(2s + 1) - /3} 

(s,r) e ]n>N,0 < s,r<N-l, 

Piis, r), is-l,r+l))^^iN- r) exp{:^(-2,s + 1) - /?} 

(s,r) e Dat, < s,r < iV- 1, 
P{{s,r),{s + \,r-l)) = ^{r-s) 

(s, r) e ©jv, < r < iV, < s < - 2 
r), (s - 1, r - 1)) = + s) min (^1, exp{:^(2s + 1) - /?} 

(s, r) e Djv, <r<iV,0<s<r. 
At this stage the statement follows easily. □ 



Proof of Lemma 6.3. In order to obtain a bound on the gap of $P$ we shall apply
once more the decomposition technique. Write

$\mathbb{D}_N = X_1 \cup X_2 \cup X_3 \cup \dots \cup X_N,$

where

$X_1 = \{(0,0), (1,1)\}, \qquad X_r = \{(u_1, u_2) \in \mathbb{D}_N : u_2 = r\}.$

On $[N] := \{1, \dots, N\}$ define a chain $P_{[N]}$ setting

and 

P\[N\\{i.i) 1 -E^K^iK^'-^^- 
Again $P_{[N]}$ is a reversible chain on $[N]$ with stationary distribution

$\pi_{[N]}(i) := \pi(X_i).$

Finally for every $r = 1, 2, \dots, N$ we define a chain on $X_r$ by setting

$P_{X_r}(a,b) := P(a,b) + 1_a(b) \sum_{z \notin X_r} P(a,z)$

where both $a$ and $b$ belong to $X_r$. Now note that for every $r = 2, 3, \dots, N$, $P_{X_r}$ is
a birth and death chain on the state space $\{(1, r), (3, r), \dots, (r, r)\}$ for $r$ odd and
$\{(0, r), (2, r), \dots, (r, r)\}$ for $r$ even. Let



0K 2 



and, for r even, 
Now observe that $P_{X_r}$ has stationary distribution

$\pi_r(s) \propto q_r(s)$

with $s = 0, 2, \dots, r$ if $r$ is even and $s = 1, 3, \dots, r$ if $r$ is odd. First of all let $r \ne 1$; by
Lemma B.3 and Lemma B.1 it is easy to check that $(P_{X_r}, \pi_r)$ meets the condition
of Lemma A.1 with

B = 2, n=[(r + 2)/2], ^ = [(r + 2)/2] A^"! 

($[x]$ being the integer part of $x$) and then

$1 - \lambda_1(P_{X_r}) \ge \frac{C_3 p_1 [(r+2)/2]}{2N[(r+2)/2]^3} \ge \frac{C_3 p_1}{2N^3}.$

Finally, Lemma B.3 with (A.1) yields

$\lambda_{|X_r|-1}(P_{X_r}) \ge 1 - p_1.$

Hence, for every $r \ne 1$, we have proved that

(B.6)  $\mathrm{Gap}(P_{X_r}) \ge \frac{C_3}{2}\, p_1 N^{-3}.$

For $r = 1$

$P_{X_1} = \begin{pmatrix} 1 - a_1/2 & a_1/2 \\ a_2/2 & 1 - a_2/2 \end{pmatrix}$


where 



So 



a2 :=pimin n,exp{^ -13} 



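$\mathrm{Gap}(P_{X_1}) \ge 1 - \left|1 - \frac{a_1}{2} - \frac{a_2}{2}\right| = \frac{a_1}{2} + \frac{a_2}{2},$

where the last equality follows from the fact that $a_1/2 \le 1/2$ and $a_2/2 \le 1/2$. Hence, for
sufficiently large $N$, it is easy to see that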

(B.7) Gap{Px,)>CiPiN~'' 

with $C_4 = C_4(\beta, K)$. At this stage (B.6) with (B.7) gives

(B.8) Gap{Px;) > 

for all $r \in [N]$. As for the gap of $P_{[N]}$: first of all note that $P_{[N]}$ is a birth and
death chain on $[N]$. From Lemma B.3

P (,,^+l):=^^^y y P(a,bMa) > P'^' ^ V V 7r{a) > ^ 
^ 27f(A;) ^ ^ ^ ' ^ ^ ^ - iV 27f a; ^ ^ ^ ' - 2N 

and analogously, 

u !■ ■ 1^ ^ Pi<^3 

P|[JV]|(«,*-1) > 

Now, for $r \ne 1$

$\pi_{[N]}(r) = q_{[N]}(r) \Big/ \Big(\sum_{i=0}^{N} q_{[N]}(i)\Big)$

while

$\pi_{[N]}(1) = (q_{[N]}(1) + q_{[N]}(0)) \Big/ \Big(\sum_{i=0}^{N} q_{[N]}(i)\Big).$



So, using the unimodality of $q_{[N]}$, we can apply Lemma B.3 with




which gives 




Using once more Lemma B.3, by (A.1) we get

$\lambda_N(P_{[N]}) \ge 1 - p_1.$




C being a suitable constant that depends by /3, K, C^^Ci^C^. 
□

Proof of Proposition 6.4. Combine Lemma 6.3 with 6.5. □